Phishing Detection, Prevention, and Notification

ABSTRACT

Phishing detection, prevention, and notification is described. In an embodiment, a messaging application facilitates communication via a messaging user interface, and receives a communication, such as an email message, from a domain. A phishing detection module detects a phishing attack in the communication by determining that the domain is similar to a known phishing domain, or by detecting suspicious network properties of the domain. In another embodiment, a Web browsing application receives content, such as data for a Web page, from a network-based resource, such as a Web site or domain. The Web browsing application initiates a display of the content, and a phishing detection module detects a phishing attack in the content by determining that a domain of the network-based resource is similar to a known phishing domain, or that an address of the network-based resource from which the content is received has suspicious network properties.

RELATED APPLICATION

This application is a continuation of and claims priority to U.S. patent application Ser. No. 11/129,222 entitled “Phishing Detection, Prevention, and Notification” filed May 13, 2005 to Goodman et al., the disclosure of which is incorporated by reference herein.

U.S. patent application Ser. No. 11/129,222 claims priority to U.S. Provisional Application Ser. No. 60/632,649 entitled “Detection, Prevention, and Notification of Fraudulent Email and/or Web Pages” filed Dec. 2, 2004 to Goodman et al., the disclosure of which is incorporated by reference herein.

TECHNICAL FIELD

This invention relates to phishing detection, prevention, and notification.

BACKGROUND

As the Internet and electronic mail (“email”, also “e-mail”) continue to be utilized by an ever-increasing number of users, fraudulent and criminal activity via the Internet and email increases as well. Phishing is becoming more prevalent and is a growing concern that can take different forms. For example, a “phisher” can target an unsuspecting computer user with a deceptive email that attempts to induce the user to respond with personal and/or financial information that can then be used for monetary gain. Often a deceptive email may appear to be legitimate or authentic, and from a well-known and/or trusted business site. A deceptive email may also appear to be from, or affiliated with, a user's bank or other creditor to further entice the user to navigate to a phishing Web site.

A deceptive email may entice an unsuspecting user to visit a phishing Web site and enter personal and/or financial information which is captured at the phishing Web site. For example, a computer user may receive an email with a message that indicates a financial account has been compromised, that an account problem needs to be attended to, and/or that the user's credentials need to be verified. The email will also likely include a clickable (or otherwise “selectable”) link to a phishing Web site where the user is requested to enter private information such as an account number, password or PIN information, mother's maiden name, social security number, credit card number, and the like. Alternatively, the deceptive email may simply entice the user to reply, fax, IM (instant message), email, or telephone with the personal and/or financial information that the requesting phisher is attempting to obtain.

SUMMARY

Phishing detection, prevention, and notification is described herein.

In an implementation, a messaging application facilitates communication via a messaging user interface, and receives a communication, such as an email message, from a domain. A phishing detection module detects a phishing attack in the communication by determining that the domain from which the communication is received is similar to a known phishing domain, or by detecting suspicious network properties of the domain from which the communication is received.

In another implementation, a Web browsing application receives content, such as data for a Web page, from a network-based resource, such as a Web site or domain. The Web browsing application initiates a display of the content, and a phishing detection module detects a phishing attack in the content by determining that a domain of the network-based resource is similar to a known phishing domain, or that an address of the network-based resource from which the content is received has suspicious network properties.

BRIEF DESCRIPTION OF THE DRAWINGS

The same numbers are used throughout the drawings to reference like features and components:

FIG. 1 illustrates an exemplary client-server system in which embodiments of phishing detection, prevention, and notification can be implemented.

FIG. 2 illustrates an exemplary messaging system in which embodiments of phishing detection, prevention, and notification can be implemented.

FIG. 3 is a flow diagram that illustrates an exemplary method for phishing detection, prevention, and notification as it pertains generally to messaging.

FIG. 4 illustrates an exemplary Web browsing system in which embodiments of phishing detection, prevention, and notification can be implemented.

FIG. 5 is a flow diagram that illustrates an exemplary method for phishing detection, prevention, and notification as it pertains generally to Web browsing.

FIG. 6 illustrates an exemplary computing device that can be implemented as any one of the devices in the exemplary systems shown in FIGS. 1, 2, and 4.

FIG. 7 is a flow diagram that illustrates another exemplary method for phishing detection, prevention, and notification.

FIG. 8 is a flow diagram that illustrates another exemplary method for phishing detection, prevention, and notification.

FIG. 9 is a flow diagram that illustrates another exemplary method for phishing detection, prevention, and notification.

FIG. 10 illustrates exemplary computing systems, devices, and components in an environment in which phishing detection, prevention, and notification can be implemented.

DETAILED DESCRIPTION

Phishing detection, prevention, and notification can be implemented to minimize phishing attacks by detecting, preventing, and warning users when a communication, such as an email, is received from a known or suspected phishing domain or sender, when a known or suspected phishing Web site is referenced in an email, and/or when a computer user visits a known or suspected phishing Web site. A fraudulent or phishing email can include any form of a deceptive email message or format that may include spoofed content and/or phishing content. Similarly, a fraudulent or phishing Web site can include any form of a deceptive Web page that may include spoofed content, phishing content, and/or fraudulent requests for private, personal, and/or financial information.

In an embodiment of the phishing detection, prevention, and notification, a history of Web sites visited by a user is checked against a list of known phishing Web sites. If a URL (Uniform Resource Locator) that corresponds to a known phishing Web site is located in the history of visited Web sites, the user can be warned via an email message or via a browser displayed message that the phishing Web site has been visited and/or private information has been submitted. In a further embodiment, the warning message (e.g., an email or message displayed through a Web browser) can contain an explanation that the phishing Web site is a spoof of a legitimate Web site and that the phishing Web site is not affiliated with the legitimate Web site.
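The history check described above can be sketched as follows. This is only an illustrative sketch: the function names, the sample history, and the choice to match on the host name of each URL are assumptions made for the example and are not taken from the described embodiments.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def visited_phishing_urls(history, known_phishing_hosts):
    """Return the history entries whose host is on the known-phishing list."""
    known = {host.lower() for host in known_phishing_hosts}
    return [url for url in history if host_of(url) in known]


# Hypothetical browsing history and known-phishing list.
history = [
    "http://www.districbank.com/login",
    "http://www.districbanc.com/verify",
]
hits = visited_phishing_urls(history, ["www.DistricBanc.com"])
for url in hits:
    print("Warning: a known phishing Web site was visited:", url)
```

Each hit could then drive the email or browser-displayed warning described in this embodiment.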

The systems and methods described herein also provide for detecting whether a referenced URL corresponds to a phishing Web site using a form of edit detection where the similarity of a fraudulent URL is compared against known and trusted URLs. Accordingly, the greater the similarity between a fraudulent URL for a phishing Web site and a URL for a legitimate Web site, the more likely it is that the fraudulent URL corresponds to a phishing Web site.

While aspects of the described systems and methods for phishing detection, prevention, and notification can be implemented in any number of different computing systems, environments, and/or configurations, embodiments of phishing detection, prevention, and notification are described in the context of the following exemplary system architecture.

FIG. 1 illustrates an exemplary client-server system 100 in which embodiments of phishing detection, prevention, and notification can be implemented. The client-server system 100 includes a server device 102 and any number of client devices 104(1−N) configured for communication with server device 102 via a communication network 106, such as an intranet or the Internet. A client and/or server device may be implemented as any form of computing or electronic device with any number and combination of differing components as described below with reference to the exemplary computing device 600 shown in FIG. 6, and with reference to the exemplary computing environment 1000 shown in FIG. 10.

In an implementation of the exemplary client-server system 100, any one or more of the client devices 104(1−N) can implement a messaging application to generate a messaging user interface 108 (shown as an email user interface in this example) and/or a Web browsing application to generate a Web browser user interface 110 for display on a display device (e.g., display device 112 of client device 104(N)). A Web browsing application can include a Web browser, a browser plug-in or extension, a browser toolbar, or any other application that may be implemented to browse the Web and Web pages. The messaging user interface 108 and the Web browser user interface 110 facilitate user communication and interaction with other computer users and devices via the communication network 106.

Any one or more of the client devices 104(1−N) can include various Web browsing application(s) 114 that can be modified or implemented to facilitate Web browsing, and which can be included as part of a data path between a client device 104 and the communication network 106 (e.g., the Internet). The Web browsing application(s) 114 can implement various embodiments of phishing detection, prevention, and notification and include a Web browser application 116, a firewall 118, an intranet system 120, and/or a parental control system 122. Any number of other various applications can be implemented in the data path to facilitate Web browsing and to implement phishing detection, prevention, and notification.

The system 100 also includes any number of other computing device(s) 124 that can be connected via the communication network 106 (e.g., the Internet) to the server device 102 and/or to any number of the client devices 104(1−N). In this example, a computing device 124 hosts a phishing Web site that an unsuspecting user at a client device 104 may navigate to from a selectable link in a deceptive email. Once at the phishing Web site, the unsuspecting user may be induced to provide personal, confidential, and/or financial information (also collectively referred to herein as “private information”). Private information obtained from a user is typically collected at a phishing Web site (e.g., at computing device 124) and is then sent to a phisher at a different Web site or via email where the phisher can use the collected private information for monetary gain at the user's expense.

FIG. 2 illustrates an exemplary messaging system 200 in which embodiments of phishing detection, prevention, and notification can be implemented. The system 200 includes a data center 202 and a client device 204 configured for communication with data center 202 via a communication network 206. The system 200 also includes a phishing Web site 208 connected via the communication network 206 to the data center 202 and/or to the client device 204.

In an embodiment, data center 202 can be implemented as server device 102 shown in FIG. 1, any number of the client devices 104(1−N) can be implemented as client device 204, and computing device 124 can be implemented as phishing Web site 208. The data center 202 and/or the client device 204 may be implemented as any form of a computing or electronic device with any number and combination of differing components as described below with reference to the exemplary computing device 600 shown in FIG. 6, and with reference to the exemplary computing environment 1000 shown in FIG. 10.

The client device 204 is an example of a messaging client that includes messaging application(s) 210 which may include an email application, an IM (Instant Messaging) application, and/or a chat-based application. A messaging application 210 generates a messaging user interface (e.g., email user interface 108) for display on a display device 212. In this example, client device 204 may receive a deceptive or fraudulent email 214, and a user interacting with client device 204 via an email application 210 and the user interface 108 may be enticed to navigate 216 to a fraudulent or phishing Web page 218 hosted at the phishing Web site 208. When a user selects a link within a phishing email and is then directed to the phishing Web page 218 via client device 204, a phisher can then obtain private information corresponding to the user, and use the information for monetary gain at the user's expense.

Client device 204 includes a detection module 220 that can be implemented as a component of a messaging application 210 to implement phishing detection, prevention, and notification. The detection module 220 can be implemented as any one or combination of hardware, software, firmware, code, and/or logic in an embodiment of phishing detection, prevention, and notification. Although detection module 220 is illustrated and described as a single module or application, the detection module 220 can be implemented as several component applications distributed to each perform one or more functions of phishing detection, prevention, and notification. Further, although detection module 220 is illustrated and described as communicating with the data center 202 which includes a list of known phishing domains 222, as well as a false positive list 224 of known legitimate domains, the detection module 220 can be implemented to incorporate the lists 222 and 224.

Detection module 220 can be implemented as integrated code of a messaging application 210, and can include algorithm(s) for the detection of fraudulent and/or deceptive phishing communications and/or messages, such as emails for example. The algorithms can be generated and/or updated at the data center 202, and then distributed to the client device 204 as an update to the detection module 220. An update to the detection module 220 can be communicated from the data center 202 via communication network 206, or an update can be distributed via computer readable media, such as a CD (compact disc) or other portable memory device.

Detection module 220 associated with a messaging application 210 is implemented to detect phishing when a user interacts with the messaging application 210 through a messaging application user interface (e.g., email user interface 108 shown in FIG. 1). Detection module 220 associated with the messaging application 210 implements features for phishing detection, prevention, and notification of fraudulent, deceptive, and/or phishing communications and messages, such as emails for example.

Detection module 220 for messaging application 210 can detect numerous aspects of a phishing message or email. For example, the data or name in a “From” field of an email can appear to be from a legitimate domain or Web site such as “DistricBank.com”, but with a similar name substitution such as “DistricBanc.com”, “DistricBank.net”, “DistricBank.org”, “D1str1cBank.com”, and the like. User-selectable links to phishing Web sites or other network-based resources included in a phishing email message can also be obscured in these and other various ways.

Data center 202 maintains the list of known phishing domains 222, as well as the false positive list 224 of known legitimate domains (i.e., known false positives) that have been deemed safe for user interaction. The false positive list 224 is a list of entities which have erroneously been marked bad, but are in fact good domains. The data center 202 may also maintain a whitelist, which is a list of things known to be good which may or may not have ever been marked as bad. In both cases, the entries in the list(s) are all good, but the false positive list 224 is more restrictive about how and/or what elements are included in the list.

A known phishing domain can be either a known target of phishing attacks (e.g., a legitimate business that phishers imitate), or a domain known to be a phishing domain, such as a domain that is implemented by phishers to steal information. The list of known phishing domains 222 includes a list of known bad URLs (e.g., URLs associated with phishing Web sites) and a list of suffixes of the known bad URLs. For example, if “www.DistricBanc.com” is a known phishing domain, then a suffix “districbanc.com” may also be included in the list of known phishing domains 222. In addition, the list of known phishing domains 222 may also include a list of known good (or legitimate) domains that are frequently targeted by phishers, such as “DistricBank.com”.
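As a rough illustration of matching against a list of suffixes of known bad URLs, the sketch below treats a host as a match when it equals a listed suffix or ends with that suffix at a label boundary. The function name and the label-boundary rule are assumptions for the example; the description does not specify how suffixes are compared.

```python
def matches_known_suffix(host, bad_suffixes):
    """True if host equals a listed suffix or is a subdomain of one."""
    host = host.lower().rstrip(".")
    for suffix in bad_suffixes:
        suffix = suffix.lower()
        # Require a '.' before the suffix so that "notdistricbanc.com"
        # does not match the suffix "districbanc.com".
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


bad_suffixes = ["districbanc.com"]  # hypothetical list entry
print(matches_known_suffix("www.DistricBanc.com", bad_suffixes))
print(matches_known_suffix("notdistricbanc.com", bad_suffixes))
```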

The data center 202 publishes the list of known phishing domains 222 to the client device 204 which maintains the list as a cached list 226 of the known phishing domains. The data center 202 may also publish a list of known non-phishing domains (not shown) to the client device 204 which maintains the list as another of the cached list(s). In an alternate implementation, the client device 204 queries the data center 202 before each domain is visited to determine whether the particular domain is a known or suspected phishing domain. A response to such a query can also be cached. If a user then visits or attempts to visit a known or suspected phishing domain, the user can be blocked or warned. However, the list of known phishing domains 222 may not be updated quickly enough. In some instances, a user may receive a fraudulent or phishing message from a phishing domain (e.g., from the phishing Web site 208) before the list of known phishing domains 222 is updated at data center 202 to include the phishing Web site 208, and before the list is published to the client device 204.
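The query-and-cache alternative can be sketched as below. The class name, the injected `query_data_center` callable, and the time-to-live are all hypothetical; the expiry is an assumption added so that a cached answer does not outlive a later update of the list, and it is not described in the text.

```python
import time


class CachedPhishingLookup:
    """Cache per-domain verdicts returned by a data-center query."""

    def __init__(self, query_data_center, ttl_seconds=3600.0, clock=time.monotonic):
        self._query = query_data_center  # callable: domain -> bool (assumed)
        self._ttl = ttl_seconds          # assumed expiry for cached answers
        self._clock = clock
        self._cache = {}                 # domain -> (verdict, time stored)

    def is_phishing(self, domain):
        domain = domain.lower()
        now = self._clock()
        entry = self._cache.get(domain)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]              # fresh cached response
        verdict = bool(self._query(domain))
        self._cache[domain] = (verdict, now)
        return verdict


# Stand-in for the data center: a local set of known phishing domains.
known = {"districbanc.com"}
lookup = CachedPhishingLookup(lambda d: d in known)
if lookup.is_phishing("DistricBanc.com"):
    print("Blocked or warned: known or suspected phishing domain")
```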

The client device 204 includes a message history 228 which would indicate that a user has received a suspected fraudulent or phishing message, such as an email, while interacting through client device 204 and a messaging application 210. After the list of known phishing domains 222 is updated at the data center 202 and/or after the data center 202 publishes the list of known phishing domains 222 to the client device 204, the message history 228 can be compared to the list of known phishing domains 222 and/or to the cached list 226 of the known phishing domains to determine whether the user has unknowingly received a fraudulent or phishing message or email.

If it is determined after the fact that a fraudulent or phishing message has been received, a warning message can be displayed to inform the user of the suspected fraudulent message. The user can then make an informed decision about what to do next, such as if the user replied to the message and provided any personal or financial information. This can give the user time to notify his or her bank, or other related business, of the information disclosure and thus preclude fraudulent use of the information that may result from the disclosure of the private information.

A phishing attack, or similar inquiry from a deceptive email, may not direct a user to a phishing Web site. Rather, an unsuspecting user may be instructed in the message to call a phone number or to fax personal information to a number that has been provided for the user in the message. There may also be phishing attacks that ask the user to send an email to an address associated with a phisher. If the user has received and previewed any such deceptive messages, the user can be warned after receiving the message, but before responding to the deceptive request for personal and/or financial information corresponding to the user. In the case of a phishing attack that directs the user to send a message (e.g., an email) with personal information, the detection module 220 for the messaging application 210 can also determine whether the user is attempting to send a message to a suspected or known fraudulent or phishing domain (e.g., phishing Web site 208), and/or can determine whether such a message has been sent. Ideally, the user can be warned before sending a message, but in some cases, a deceptive message may not be detected until after the user has sent a response.

The detection module 220 can detect a deceptive, fraudulent, or phishing email by examining the message content to determine a context of the email message, such as whether the message includes reference(s) to security, personal, and/or financial information. Further, a message can be examined to detect or determine whether it contains a suspicious URL, is likely to confuse a user, or is usually emailed out as spam to multiple recipients.

A user can also be warned of suspected phishing activity when replying to a suspicious or known fraudulent email message, or when sending an email communication to a suspected or known fraudulent address. The user can be warned directly at the client device 204, and/or if detection occurs at least in part at a data center 202 and/or at an associated email server, then data center 202 (and/or the associated email server) can send a warning message to a mailbox of the user with an indication as to why a particular email message is suspected of being deceptive or fraudulent.

Conventional anti-phishing tools simply indicate to a user that a message is fraudulent or not fraudulent. However, in many cases, an indicator can be suspicious without being definitive. Descriptive warning messages allow for more aggressive detection, and are intended to provide sufficient information so that a user can use his or her knowledge and judgment about a likely fraudulent email. For example, a user can be warned with messages such as “Warning: this message is from Districbank-Security.com, which, to the best of our knowledge, is not affiliated with Districbank.com. Please use caution if a message requests information about a DistricBank account”, or “Warning: Note that this message is from DistricBanc.com and is not affiliated or from DistricBank.com. Please use caution if this message requests information about a DistricBank account.” In this example, the warning message emphasizes the domain differences for the user by underlining the altered letters to indicate the likelihood of confusion. Any other form(s) of emphasis, such as “bold” or a “highlight”, can also be utilized to emphasize a warning message.

A user can also be warned about specific user-selectable navigation links in an email message. For example, an IP (Internet Protocol) address may be included in an email rather than a domain name because the domain name would have to be registered, and is likely traceable to the phisher that registered the domain name. A user can be warned when clicking on an IP address link included in an email message with a warning such as “Warning: the link you clicked on is an IP address. This kind of link is often used by phishing scams. Be cautious if a Web page asks you for any personal or financial information.” This type of warning provides a user with enough information to make an informed decision rather than relying on a simple “yes” or “no” from a phishing tool that does not provide sufficient information as to the reason(s) for the decision.
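A minimal sketch of the IP address link check, using only the Python standard library, might look like the following. The function name and the sample address (from a range reserved for documentation) are illustrative assumptions.

```python
import ipaddress
from urllib.parse import urlsplit


def link_uses_ip_address(url):
    """True if the host part of the URL is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # the host is a name, not an address
    return True


if link_uses_ip_address("http://192.0.2.10/login"):
    print("Warning: the link you clicked on is an IP address.")
```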

The detection module 220 can be implemented to detect various deceptive and/or fraudulent aspects of messages, such as emails. An example is a mismatch of the link text and the URL corresponding to a phishing Web site that a user is being requested, or enticed, to visit. A Web site link can appear as http://www.DistricBank.com/security having the link text “DistricBank”, but which directs a user to a Web site, “StealYourMoney.com”. Another common deception is a misuse of the “@” symbol in a URL. For example, a URL http://www.DistricBank.com@stealyourmoney.com directs a user to a Web site “StealYourMoney.com”, and not to “DistricBank.com”.
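Both deceptions can be illustrated with a short sketch. The function names are hypothetical, and the mismatch test is an assumption that only compares hosts when the visible link text itself looks like a host name or URL.

```python
from urllib.parse import urlsplit


def has_at_symbol_trick(url):
    """True if the authority part contains '@', so the text before it is not the host."""
    return "@" in urlsplit(url).netloc


def link_text_mismatch(link_text, href):
    """True if the link text looks like a URL whose host differs from the real target."""
    shown = link_text.strip()
    if "://" not in shown:
        shown = "http://" + shown
    shown_host = (urlsplit(shown).hostname or "").lower()
    real_host = (urlsplit(href).hostname or "").lower()
    if "." not in shown_host:
        return False  # plain words such as "DistricBank" are not compared
    return shown_host != real_host


url = "http://www.DistricBank.com@stealyourmoney.com"
print(has_at_symbol_trick(url), urlsplit(url).hostname)
print(link_text_mismatch("http://www.DistricBank.com/security",
                         "http://stealyourmoney.com/"))
```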

The detection module 220 can also be implemented to detect a URL that has been encoded to obfuscate the URL. For example, hexadecimal representations can be substituted for other characters in a URL such that DistricB%41nk.com is equivalent to DistricBank.com, and such that DistricBanc.com.%41%42%43%44evil.com is equivalent to the URL DistricBanc.com.abcdevil.com, although some users may not notice the part of the URL after the first “.com”. Some character representations are expected, such as an “_” (underscore), “~” (tilde), or other character that may be encoded in a URL for a legitimate reason. However, encoding an alphabetic, numeric, or similar character may be detected as fraudulent, and detection module 220 can be implemented to initiate a warning to a user that indicates why a particular selectable link, URL, or email address is likely fraudulent.
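One way to picture this check is a scan for percent-escapes that decode to an ASCII letter or digit, since those characters never need to be encoded. The sketch below is illustrative only; the function name is an assumption, and a fuller check would also have to decide which other escapes are legitimate.

```python
import re
from urllib.parse import unquote

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def suspicious_escapes(url):
    """Return the percent-escapes in url that decode to an ASCII letter or digit."""
    found = []
    for match in _ESCAPE.finditer(url):
        ch = chr(int(match.group(1), 16))
        if ch.isascii() and ch.isalnum():
            found.append(match.group(0))
    return found


encoded = "DistricB%41nk.com"
print(suspicious_escapes(encoded))  # escapes that hide a letter or digit
print(unquote(encoded))             # the decoded form of the same string
```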

Detectable features of deceptive or fraudulent phishing emails include one or more of an improper use of the “@” symbol, use of deceptive encoding, use of an IP address selectable link, use of a redirector, a mismatch between link text and the URL, and/or any combination thereof. Other detectable features of deceptive or fraudulent phishing include deceptive requests for personal information and suspicious words or groups of words, having a resemblance to a known fraudulent URL, a resemblance to a known phishing target in the title bar of a Web page, and/or any one of a suspicious message recipient, sender address, or display name in a message or email. A typical “From” line in an email is of the form: “From: “My Name” <myname@example.com>”, and the portion “My Name” is called the “Display Name” and is typically displayed to a user. A phisher might send email: “From: “Security@DistricBank.com” <badguy@stealmoney.com>”, which may pass anti-spoofing checks if “stealmoney.com” has anti-spoofing technology installed (since the email is not spoofed), and which might fool users because of the display name information.
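The display name deception can be sketched as a comparison between any domain that appears in the display name and the domain of the actual sender address. The function name and the simple domain pattern are assumptions for illustration; the header is written with the standard angle-bracket address form.

```python
import re
from email.utils import parseaddr

# Assumed, deliberately simple pattern for something that looks like a domain.
_DOMAIN = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def display_name_domain_mismatch(from_header):
    """True if the display name shows a domain that is not the sender's domain."""
    name, addr = parseaddr(from_header)
    sender_domain = addr.rpartition("@")[2].lower()
    for shown in _DOMAIN.findall(name):
        if shown.lower() != sender_domain:
            return True
    return False


header = '"Security@DistricBank.com" <badguy@stealmoney.com>'
print(display_name_domain_mismatch(header))
```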

The detection module 220 can also be implemented to compute an edit distance to determine the similarity between two strings. Edit distance is the number of insertions, deletions, and substitutions that would be required to transform one string to another. For example, Disttricbnc.com has an edit distance of three (3) from DistricBank.com because it would require one deletion (t), one insertion (a), and one substitution (k for c) to change Disttricbnc.com to DistricBank.com. A “human-centered” edit distance can be factored into detection module 220 that includes less of an emphasis for some changes, such as for “c” to “k” and for the number “1” for the lower-case L-letter “l”. Other emphasis factors can include doubling or undoubling letters (e.g., “tt” changed to “t”) as well as for certain wholesale changes such as “.com” changed to “.net”, or for other changes that are not likely to be noticed by a user, such as “Distric” changed to “District”. Additionally, the safe-list 224 of known false positives can be maintained for legitimate domains that may otherwise be detected as fraudulent domains. For instance, it might be the case that DistricBank.com is a large, legitimate bank and often a target of phishers, while DistricBanc.com is a small, yet legitimate bank. It is important not to warn all users of DistricBanc.com that their email appears to be fraudulent, and safe-listing is one example implementation to solve this.
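The edit distance computation, and one possible reading of the “human-centered” weighting, can be sketched with the standard dynamic-programming recurrence. The comparison is case-insensitive, and the reduced cost of 0.5 for the “c”/“k” and “1”/“l” substitutions is an assumed value chosen for illustration; the description does not give specific weights.

```python
def edit_distance(a, b, sub_cost=None):
    """Edit distance between a and b (case-insensitive), with pluggable substitution cost."""
    a, b = a.lower(), b.lower()
    if sub_cost is None:
        sub_cost = lambda x, y: 1.0
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [float(i)]
        for j, cb in enumerate(b, 1):
            cost = 0.0 if ca == cb else sub_cost(ca, cb)
            cur.append(min(prev[j] + 1.0,          # deletion
                           cur[j - 1] + 1.0,       # insertion
                           prev[j - 1] + cost))    # substitution or match
        prev = cur
    return prev[-1]


# Character pairs named in the text as easy to overlook.
CONFUSABLE = {frozenset("ck"), frozenset("1l")}


def human_sub_cost(x, y):
    """Assumed weighting: confusable substitutions cost half a normal edit."""
    return 0.5 if frozenset((x, y)) in CONFUSABLE else 1.0


print(edit_distance("Disttricbnc.com", "DistricBank.com"))
print(edit_distance("Disttricbnc.com", "DistricBank.com", human_sub_cost))
```

A lower weighted distance to a frequently targeted domain would indicate a closer, and therefore more suspicious, resemblance, subject to the safe-list of known false positives.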

The detection module 220 can be implemented to detect fraudulent messages through the presence of links containing at least one of an IP address, an “@” symbol, or suspicious HTML encoding. Other detectable features or aspects include whether an email message fails SenderID or another anti-spoofing technology. The SenderID protocol is implemented to authenticate the sender of an email and attempts to identify an email sender in an effort to detect spoofed emails. A Domain Name System (DNS) server maintains records for network domains, and when an email is received by an inbound mail server, the server can look up the published DNS record of the domain from which the email is originated to determine whether an IP (Internet Protocol) address of a service provider corresponding to the domain matches a network domain on record. An email with a spoofed (or faked) “From:” address, as detected by the SenderID protocol or other anti-spoofing protocol, is especially suspicious although there may be legitimate reasons as to why this sometimes happens. Email that is detected as spoofed is sometimes deleted, placed in a junk folder, or bounced, but may also be delivered by some systems. The detection of spoofing can be implemented as an additional input to an anti-phishing system.

The detection module 220 can also be implemented to detect other fraudulent or deceptive features or aspects of a message, such as whether an email contains content known to be associated with phishing; is from a domain that does not provide anti-spoofing information; is from a newly established domain (phishing sites tend to be new); contains links to, or is a Web page in, a domain that provides only a small amount of content when the domain is indexed; contains links to, or is a Web page in, a domain with a low search engine score or static rank (or similar search engine query-independent ranking score); and/or whether the Web page is hosted via a cable, DSL, or dialup communication link. Typically, a low static rank means that there are not many Web links to the Web page, which is typical of phishing pages and not typical of large legitimate sites.

The detection module 220 can also be implemented to detect that data being requested in an email or other type of message is personal identifying information, such as if the text of the message includes words or groups of words like “credit card number”, “Visa”, “MasterCard”, “expiration”, “social security”, and the like. Further, the detection module 220 can be implemented to detect that data being submitted by a user is in the form of a credit card number, or matches data known to be personal identifying information, such as the last four digits of a social security number. In an embodiment, only a portion or hash of a user's social security number, credit card number, or other sensitive data can be stored so that if the computer is infected by spyware, the user's personal data cannot be easily stolen.
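The sketch below illustrates both ideas under stated assumptions. The text does not say how “the form of a credit card number” is recognized; the Luhn checksum and the 13 to 19 digit length range used here are a commonly used stand-in, not the described method. Likewise, the salted SHA-256 fingerprint is only one possible way to store a hash instead of the sensitive value, and the card number shown is a widely published test number.

```python
import hashlib
import hmac
import re


def luhn_valid(digits):
    """Luhn checksum over a string of decimal digits."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        d = int(ch)
        if index % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card_number(text):
    """True if text, ignoring spaces and dashes, has the form of a card number."""
    digits = re.sub(r"[ -]", "", text)
    return digits.isdigit() and 13 <= len(digits) <= 19 and luhn_valid(digits)


def fingerprint(value, salt):
    """Salted hash stored in place of the sensitive value itself."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def matches_stored(value, salt, stored):
    """Compare submitted data with a stored fingerprint in constant time."""
    return hmac.compare_digest(fingerprint(value, salt), stored)


salt = b"example-salt"                 # hypothetical per-user salt
stored = fingerprint("6789", salt)     # e.g., last four digits of an SSN
print(looks_like_card_number("4111 1111 1111 1111"))
print(matches_stored("6789", salt, stored))
```

A short value such as four digits can be recovered from its hash by trying every possibility, so a stored hash of that kind limits casual exposure rather than providing strong protection.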

The detection module 220 can also be implemented to utilize historical data pertaining to domains that have been in existence for a determinable duration, and have not historically been associated with phishing or fraudulent activities. The detection module 220 can also include location dependent phishing lists and/or whitelists. For example, “Westpac” is a large Australian-based bank, but there may not be a perceptible need to warn U.S. users about suspected phishing attacks on “Western Pacific University”. The detection implementation of the detection module 220 can be more aggressive by implementing location and/or language dependent exclusions.

Methods for phishing detection, prevention, and notification are described with reference to FIGS. 3, 5, 7, 8, and 9, and may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types. The methods may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices. In addition, any one or more method blocks described with reference to one of the methods described herein can be combined with any one or more method blocks described with reference to any other of the methods to implement various embodiments of phishing detection, prevention, and notification.

FIG. 3 illustrates an exemplary method 300 for phishing email detection, prevention, and notification and is described with reference to the exemplary messaging system shown in FIG. 2. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At block 302, a communication is received from a domain. For example, messaging application 210 receives an email message from a domain, such as the phishing Web site 208. At block 304, a messaging user interface is rendered to facilitate communication via a messaging application. For example, a messaging application 210 generates a messaging user interface (e.g., email application user interface 108 shown in FIG. 1) such that a user at client device 204 can communicate via email or other similar messaging applications.

At block 306, each domain in the communication is compared to a list of known phishing domains to determine whether the communication is a phishing communication, based in part on the “From” domain of the message compared to known phishing email senders and known phishing victims, links in the communication, email addresses in the communication, and/or based on the content of the message. Several domains can be found in a communication or message. These include the domain that the communication (e.g., email) is allegedly from, any specified reply-to domain (which may be different than the from domain), domains listed in a display name, domains in the text of the message, domains in links in the message, and domains in email addresses in the message. For example, detection module 220 compares the domain corresponding to the phishing Web site 208 to the list of known phishing domains 222 or cached list 226 of known phishing domains.

At block 308, a phishing attack is detected in the communication at least in part by determining that a domain in the communication is similar to a known phishing domain. For example, the detection module 220 determines that the domain corresponding to the phishing Web site 208 is similar to or included in the list of known phishing domains 222, which is detected as a phishing attack. A known phishing domain can either be a domain known to be used by phishers (e.g., “DistricBank.biz”), or a known, legitimate domain targeted by phishers, such as “DistricBank.com”. For example, a “From” domain (which is easily faked) of “DistricBank.com” combined with a link to “DistricBank.biz” would be highly suspicious.

The phishing attack can also be detected by the detection module 220 when a name of the domain is similar in edit-distance to the known phishing domain, and/or when the edit-distance is based at least in part on the likelihood of user confusion, or based at least in part on a site-specific change. The phishing attack can be detected as a user-selectable link within the received communication where the user-selectable link includes an IP (Internet protocol) address, an “@” sign, and/or suspicious HTML (Hypertext Markup Language) encoding. The phishing attack can also be detected if the communication fails anti-spoofing detection, contains suspicious text content, is received from the domain which does not provide anti-spoofing information, contains a user-selectable link to a minimal amount of content, and/or is received via at least one of a dial-up, cable, or DSL (Digital Subscriber Line) communication link.

The phishing attack can also be detected by the detection module 220 if the communication is received from a new domain, and/or if the content includes a user-selectable link to a Web-based resource. The phishing attack can also be detected when an IP (Internet protocol) address corresponding to the domain does not match the country where the domain is located. The phishing attack can also be detected if the communication includes a user-selectable link which includes link text and a mismatched URL (Uniform Resource Locator). If the received communication is an email message, the detection module 220 can examine data and/or a name in a “From” field of the email to detect the phishing attack. In an event that an email is communicated from messaging application 210, the detection module 220 can detect a phishing attack by examining data in a “To” field of the email, a “CC” (carbon copy) field of the email, and/or a “BCC” (blind carbon copy) field of the email.

FIG. 4 illustrates an exemplary Web browsing system 400 in which embodiments of phishing detection, prevention, and notification can be implemented. The system 400 includes a data center 402 and a client device 404 configured for communication with data center 402 via a communication network 406. The system 400 also includes a phishing Web site 408 connected via the communication network 406 to the data center 402 and/or to the client device 404.

In an embodiment, data center 402 can be implemented as server device 102 shown in FIG. 1, any number of the client devices 104(1−N) can be implemented as client device 404, and computing device 124 can be implemented as phishing Web site 408. The data center 402 and/or client device 404 may be implemented as any form of computing or electronic device with any number and combination of differing components as described below with reference to the exemplary computing device 600 shown in FIG. 6, and with reference to the exemplary computing environment 1000 shown in FIG. 10.

The client device 404 is an example of a Web browsing client that includes Web browsing application(s) 410 to generate a Web browser user interface (e.g., Web browser user interface 110) for display on a display device 412. In this example, a user browsing the Web at client device 404 may be enticed (e.g., when receiving a phishing email) to navigate to a fraudulent or phishing Web page 414 hosted at the phishing Web site 408. The phishing Web page is rendered on display 412 at client device 404 as Web page 416 which is a user-interactive form through which the unsuspecting user might enter personal and/or financial information, such as bank account information 418. The phishing Web page 416 may also be deceptive in that a user intended to navigate to his or her bank, “DistricBank” as indicated on the Web page 416, when in fact the unsuspecting user has been directed to a fraudulent, phishing Web page as indicated by the address “www.districbanc.com”.

The phishing Web page 416 contains an interactive form that includes various information fields that can be filled-in with user specific, private information via interaction with data input devices at client device 404. Form 416 includes information fields 418 for a bank member's name, account number, and a password, as well as several selectable fields that identify the type of banking accounts associated with the user. When a user interacts with the phishing Web page 416 via client device 404, a phisher can capture the personal and/or financial information 418 corresponding to the user and then use the information for monetary gain at the user's expense.

Client device 404 includes a detection module 420 that can be implemented as a browsing toolbar plug-in for a Web browsing application 410 to implement phishing detection, prevention, and notification. The detection module 420 can be implemented as any one or combination of hardware, software, firmware, code, and/or logic in an embodiment of phishing detection, prevention, and notification. Although detection module 420 for the Web browsing application 410 is illustrated and described as a single module or application, the detection module 420 can be implemented as several component applications distributed to each perform one or more functions of phishing detection, prevention, and notification.

Detection module 420 can also be implemented as an integrated component of a Web browsing application 410, rather than as a toolbar plug-in module. The detection module 420 can include algorithm(s) for the detection of fraudulent and/or deceptive phishing Web sites and domains. The algorithms can be generated and/or updated at the data center 402, and then distributed to the client device 404 as an update to the detection module 420.

Detection module 420 associated with a Web browsing application 410 is implemented to detect phishing when a user interacts with the Web browsing application 410 through a Web browsing user interface (e.g., Web browser user interface 110 shown in FIG. 1). Detection module 420 associated with the Web browsing application 410 implements features for phishing detection, prevention, and notification of fraudulent, deceptive, and/or phishing Web sites.

Data center 402 maintains a list of known phishing Web sites and redirectors 422, as well as a false positive list 424 (or a whitelist) of known legitimate Web sites that have been deemed safe for user interaction. The list of known phishing Web sites 422 includes a list of known bad URLs (e.g., URLs associated with phishing Web sites) and a list of ancestors of the known bad URLs. The data center 402 publishes the list of known phishing Web sites and redirectors 422 to the client device 404 which maintains the list as a cached list 426 of the known phishing Web sites. Alternatively, and/or in addition, the client device 404 can query the data center 402 about each URL the user visits, and cache the results of the queries. In some instances, a user may navigate to a phishing Web site 408 before the list of known phishing Web sites 422 is updated at data center 402 to include the phishing Web site 408, and before the list is published to the client device 404.

The client device 404 includes a history of visited Web sites 428 which would indicate that a user interacting through client device 404 has navigated to phishing Web site 408. After the list of known phishing Web sites 422 is updated at the data center 402 and/or after the data center 402 publishes the list of known phishing Web sites 422 to the client device 404, the history of visited Web sites 428 can be compared to the list of known phishing Web sites 422 and/or to the cached list 426 of the known phishing Web sites to determine whether the user has unknowingly visited the phishing Web site 408.
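The after-the-fact comparison of the history of visited Web sites against an updated list can be sketched as follows. This is a simplified illustration only: the function names are hypothetical, and matching is done on host names, whereas the list described above also holds full URLs and their ancestors.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host of a URL, or an empty string if none is present."""
    return (urlsplit(url).hostname or "").lower()


def visited_phishing_sites(history, known_phishing_hosts):
    """Return the history entries whose host appears in the (updated) phishing list."""
    bad_hosts = {host.lower() for host in known_phishing_hosts}
    return [url for url in history if host_of(url) in bad_hosts]


# Example: the cached list is updated after the user already visited the site.
history = ["http://www.districbanc.com/login", "https://www.example.org/"]
updated_list = {"www.districbanc.com"}
for url in visited_phishing_sites(history, updated_list):
    print("Warning: a suspected phishing Web site was visited:", url)
```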

If it is determined after the fact that a user has visited a phishing Web site, a warning message can be displayed to inform the user that the phishing Web site (or suspected phishing Web site) has been visited. The user can then make an informed decision about what to do next, such as if the user provided any personal or financial information while at the phishing Web site. This can give the user time to notify his or her bank, or other related business, of the information disclosure and thus preclude fraudulent use of the information that may result from the disclosure of the private information. Additionally, the detection module 420 can determine for the user whether the private information and/or other data was submitted, such as through an HTML form, and then warn the user if the private information was actually submitted rather than the user just visiting the phishing Web site.

Detection module 420 can query or access the cached list 426 of known phishing Web sites maintained at client device 404, communicate a query to data center 402 to determine if a Web site is a phishing Web site from the list of known phishing Web sites 422, or both. This can be implemented either by explicitly storing the user's history of visited Web sites 428, or by using the history already stored by a Web browsing application 410. A Web browsing application 410 can compare the history of visited Web sites 428 to the updated cached list 426 of known phishing Web sites. Alternatively, or in addition, the Web browsing application 410 can periodically communicate the list of recently visited Web sites to poll an on-line phishing check at data center 402.

A user can be warned of a suspected phishing Web site, such as when Web page 416 is rendered for user interaction. A user can be warned with messages such as “Warning: this Web site contains an address name for “districbanc.com”, which, to the best of our knowledge, is not affiliated with “DistricBank”. Please use caution if submitting any personal or financial information about a DistricBank account.”

A user can also be warned about specific user-selectable navigation links in a Web page. For example, an IP (Internet Protocol) address may be included in a Web page rather than a domain name because the domain name would have to be registered, and is likely traceable to the phisher that registered the domain name. A user can be warned when clicking on an IP address link included on a Web page with a warning such as “Warning: the link you clicked on is an IP address. This kind of link is often used by phishing scams. Be cautious if the Web page asks you for any personal or financial information.” IP address links are often used in fraudulent email, but may also be used in legitimate email. Simply blocking or allowing the user to visit a site does not provide the user with enough information to consistently make the correct decision. As such, informing the user of the reason(s) for suspicion provides a user with enough information to make an informed decision.
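A check for IP address links can be sketched as shown here. This is an illustrative example under stated assumptions: the function name `is_ip_link` is hypothetical, and only dotted IPv4 and bracketed IPv6 literals are recognized; other numeric host spellings are not handled.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ip_link(url):
    """Return True if the host part of the link is a literal IP address."""
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # The host is a name (or something else), not an IP literal.
        return False


if is_ip_link("http://192.0.2.10/login"):
    print("Warning: the link you clicked on is an IP address.")
```

Because IP address links also occur in legitimate email, a result of `True` is better treated as a reason to warn than as a reason to block.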

The detection module 420 can be implemented to detect various deceptive and/or fraudulent aspects of Web pages. An example is a mismatch of the link text and the URL corresponding to a phishing Web site that a user is being requested, or enticed, to visit. A Web site link can appear as http://www.DistricBank.com/security having the link text “DistricBank”, but which directs a user to a Web site, “StealYourMoney.com”. Another common deception is a misuse of the “@” symbol in a URL. For example, a URL http://www.DistricBank.com@stealyourmoney.com directs a user to a Web site “StealYourMoney.com”, and not to “DistricBank.com”.
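The two deceptions above can be illustrated with a short sketch. The function names are hypothetical, and the mismatch check is an assumption-laden simplification: it compares hosts only when the link text itself looks like a URL or host name, so a plain label such as “DistricBank” is not evaluated.

```python
from urllib.parse import urlsplit


def real_host(url):
    """Return the host a browser would actually contact, lower-cased."""
    return (urlsplit(url).hostname or "").lower()


def has_userinfo(url):
    """Return True if the authority section contains an "@" (user-info) part."""
    return "@" in urlsplit(url).netloc


def link_text_mismatch(link_text, href):
    """Return True if URL-like link text names a different host than the href."""
    text = link_text.strip()
    if any(ch.isspace() for ch in text) or "." not in text:
        return False  # ordinary label; nothing to compare in this sketch
    if "://" not in text:
        text = "http://" + text
    shown = real_host(text)
    return shown != "" and shown != real_host(href)


print(real_host("http://www.DistricBank.com@stealyourmoney.com"))
print(link_text_mismatch("http://www.DistricBank.com/security",
                         "http://StealYourMoney.com/"))
```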

The detection module 420 can also be implemented to detect a redirector which is a URL that redirects a user from a first Web site to another Web site. For example, http://www.WebSite.com/redirect?http://StealMoney.com first directs a user to “WebSite.com”, and then automatically redirects the user to “StealMoney.com”. Typically, a redirector includes two domains (e.g. “WebSite.com” and “StealMoney.com” in this example), and will likely include an embedded “http://”. Redirectors are also used for legitimate reasons, such as to monitor click-through rates on advertising. As such, if a redirected site is included in a link (e.g., “StealMoney.com” in this example), the redirected site can be compared to the list of known or suspected phishing sites 422 maintained at data center 402.
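One way to find the redirected site inside a link is to look for an embedded “http://” after the first host, as in this sketch. The function name and the regular expression are assumptions chosen for illustration; real redirectors can carry the target in other forms that this does not cover.

```python
import re
from urllib.parse import unquote, urlsplit

# An embedded absolute URL, ending at whitespace or a URL delimiter.
_EMBEDDED_URL = re.compile(r"https?://[^\s?#&]+", re.IGNORECASE)


def embedded_targets(url):
    """Return the hosts of any URLs embedded in the path, query, or fragment."""
    parts = urlsplit(url)
    tail = unquote(parts.path + "?" + parts.query + "#" + parts.fragment)
    hosts = []
    for match in _EMBEDDED_URL.finditer(tail):
        host = urlsplit(match.group(0)).hostname
        if host:
            hosts.append(host.lower())
    return hosts


def redirects_to_known_phisher(url, known_phishing_hosts):
    """Compare each redirected host to a list of known or suspected phishing hosts."""
    return any(host in known_phishing_hosts for host in embedded_targets(url))


link = "http://www.WebSite.com/redirect?http://StealMoney.com"
print(embedded_targets(link))
print(redirects_to_known_phisher(link, {"stealmoney.com"}))
```

Since redirectors also have legitimate uses, the sketch only flags a link when the embedded host is on the list, rather than whenever a redirect is present.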

The detection module 420 can also be implemented to detect a URL that has been encoded to obfuscate the URL. For example, hexadecimal representations can be substituted for other characters in a URL such that DistricB%41nk.com is equivalent to DistricBank.com. Some character representations are expected, such as an “_” (underscore), “˜” (tilde), or other character that may be encoded in a URL for a legitimate reason. However, encoding an alphabetic, numeric, or similar character may be detected as fraudulent, and detection module 420 can be implemented to initiate a warning to a user that indicates why a particular selectable link, URL, or email address is likely fraudulent.
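A sketch of flagging needlessly encoded characters follows. The function name is hypothetical, and the rule used (any percent-encoded ASCII letter or digit is reported) is one simple reading of the paragraph above, not a definitive test.

```python
import re

# A percent sign followed by two hexadecimal digits.
_PERCENT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def needlessly_encoded_chars(url):
    """Return the ASCII letters and digits that appear percent-encoded in the URL."""
    found = []
    for match in _PERCENT_ESCAPE.finditer(url):
        ch = chr(int(match.group(1), 16))
        if ch.isascii() and ch.isalnum():
            found.append(ch)
    return found


print(needlessly_encoded_chars("http://DistricB%41nk.com/"))
print(needlessly_encoded_chars("http://example.com/a%20b/%7Euser"))
```

The returned characters can be quoted in the warning so that the user is told why the link looks suspicious.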

Detectable features of deceptive or fraudulent phishing include one or more of an improper use of the “@” symbol, use of deceptive encoding, use of an IP address selectable link, use of a redirector, a mismatch between link text and the URL, and/or any combination thereof. Other detectable features of deceptive or fraudulent phishing include deceptive requests for personal information and suspicious words or groups of words, having a resemblance to a known fraudulent URL, and/or a resemblance to a known phishing target in the title bar of a Web page.

The detection module 420 can also be implemented to detect an edit distance to determine the similarity between two strings. Edit distance is the number of insertions, deletions, and substitutions that would be required to conform one string to another. For example, Disttricbnc.com has an edit distance of three (3) from DistricBank.com because it would require one deletion (t), one insertion (a), and one substitution (k for c) to change Disttricbnc.com to DistricBank.com. A “human-centered” edit distance can be factored into detection module 420 that includes less of an emphasis for some changes, such as for “c” to “k” and/or the number “1” changed for the lower-case letter “l”. Other emphasis factors can include doubling or undoubling letters (e.g., “tt” changed to “t”) as well as for certain wholesale changes such as “.com” changed to “.net”, or “Distric” changed to “District”. Additionally, a safe-list of known false positives can be maintained for legitimate domains that may otherwise be detected as fraudulent domains.
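The edit distance example above can be reproduced with a short sketch. The half-cost weights for confusable characters and for doubled letters are assumptions chosen only to illustrate a “human-centered” variant; comparison is case-insensitive, and wholesale changes such as “.com” to “.net” are not modeled.

```python
# Character pairs that a reader may easily confuse (illustrative, not exhaustive).
CONFUSABLE = {frozenset("ck"), frozenset("1l")}


def _sub_cost(a, b, human):
    if a == b:
        return 0.0
    if human and frozenset((a, b)) in CONFUSABLE:
        return 0.5
    return 1.0


def _gap_cost(s, i, human):
    """Cost to insert or delete s[i-1]; cheaper if it repeats the previous letter."""
    if human and i >= 2 and s[i - 1] == s[i - 2]:
        return 0.5
    return 1.0


def edit_distance(src, dst, human=False):
    """Weighted Levenshtein distance between two strings, ignoring case."""
    src, dst = src.lower(), dst.lower()
    prev = [0.0]
    for j in range(1, len(dst) + 1):
        prev.append(prev[j - 1] + _gap_cost(dst, j, human))
    for i in range(1, len(src) + 1):
        cur = [prev[0] + _gap_cost(src, i, human)]
        for j in range(1, len(dst) + 1):
            cur.append(min(
                prev[j] + _gap_cost(src, i, human),                    # deletion
                cur[j - 1] + _gap_cost(dst, j, human),                 # insertion
                prev[j - 1] + _sub_cost(src[i - 1], dst[j - 1], human),  # substitution
            ))
        prev = cur
    return prev[-1]


print(edit_distance("Disttricbnc.com", "DistricBank.com"))              # 3.0
print(edit_distance("Disttricbnc.com", "DistricBank.com", human=True))  # 2.0
```

With the human-centered weights, the undoubled “t” and the “c” for “k” swap each count as half a change, so the look-alike domain scores as closer to the legitimate one.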

The detection module 420 can also be implemented to detect other fraudulent or deceptive features or aspects of a phishing Web page, such as whether a Web page contains content known to be associated with phishing; is from a newly established domain (i.e., phishing sites tend to be new); is from a domain that is seldom visited (has low traffic); is from a domain hosted by a Web hosting site; contains links to, or is a Web page in a domain that provides only a small amount of content when the domain is indexed; contains links to, or is a Web page in a domain with a low search engine score or static rank (e.g., there are not many Web links to the Web page); and/or whether the Web page is hosted via a Cable, DSL, or dialup communication link.

The detection module 420 for a Web browsing application 410 can be implemented to detect other features or aspects that may indicate a phishing Web page, such as whether the Web page contains an obscured form field; has a form field name that does not match what is posted on the page; has a form field name that is not discernable by a user, such as due to font size and/or color; has a URL that includes control characters (i.e., those with ASCII codes between zero and thirty-one (0-31)); has a URL that includes unwise character encodings (e.g., encodings in the path or authority section of a URL are typically unwise); includes HTML character encoding techniques in a URL (e.g., includes a “&#xx” notation where “xx” is an ASCII code); has a URL that includes an IP version six address; and/or has a URL that includes a space character which can be exploited.

A fraudulent, deceptive, or phishing Web page often includes content, such as images and text, from a legitimate Web site. To reduce bandwidth or for simplicity, a phishing Web page may be developed using pointers to images on a Web page at a legitimate Web site. It may also open windows or use frames to directly display content from the legitimate site. User-selectable links to legitimate Web pages may also be included, such as a link to a privacy policy at a legitimate Web site. The detection module 420 can be implemented to detect a fraudulent, deceptive, or phishing Web page that includes a large number of links to one other legitimate Web site, and particularly to a Web site that is commonly spoofed, and which includes another selectable link that points to a different Web site, or contains a form that sends data to a different Web site.

The detection module 420 can also be implemented to detect that the data being requested via a Web page is personal identifying information, such as if the Web page includes words or groups of words like “credit card number”, “Visa”, “MasterCard”, “expiration”, “social security”, and the like, or if the form that collects the data contains a password-type field. Further, the detection module 420 can be implemented to detect that data being submitted by a user is in the form of a credit card number, or matches data known to be personal identifying information, such as the last four digits of a social security number, or is likely an account number, for example, if the data is many characters long and consists entirely of numbers and punctuation.
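The keyword and account-number heuristics can be sketched as follows. The phrase list comes from the text above; the function names, the minimum length of eight characters standing in for “many characters long”, and the decision to also allow spaces are assumptions for illustration.

```python
import string

PII_PHRASES = ("credit card number", "visa", "mastercard",
               "expiration", "social security")

# Digits, punctuation, and (as an assumption) spaces used as separators.
_ALLOWED = set(string.digits + string.punctuation + " ")


def requested_pii_phrases(page_text):
    """Return the suspicious phrases that occur in the page text."""
    text = page_text.lower()
    return [phrase for phrase in PII_PHRASES if phrase in text]


def looks_like_account_number(value, min_length=8):
    """True if the value is long and consists only of numbers and punctuation."""
    value = value.strip()
    if len(value) < min_length:
        return False
    has_digit = any(ch.isdigit() for ch in value)
    return has_digit and all(ch in _ALLOWED for ch in value)


print(requested_pii_phrases("Please enter your Credit Card Number and expiration date"))
print(looks_like_account_number("1234-5678-9012"))
```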

Detection module 420 for a Web browsing application 410 can also be implemented to detect that a Web page may be fraudulent if private information is requested, yet there is no provision for submitting the information via HTTPS (secure HTTP). A phisher may not be able to obtain an HTTPS certificate which is difficult to do anonymously, and will forgo the use of HTTPS to obtain the private information.

Detection module 420 can also be implemented to determine the country or IP range in which a Web server is located to further detect phishing Web sites on the basis of historical phishing behavior of that country or IP range. This can be accomplished using any one or more of the associated IP information, Whois information (e.g., to identify the owner of a second-level domain name), and Traceroute information. The location of a user can be determined from an IP address, registration information, configuration information, and/or version information. The detection module 420 for a Web browsing application 410 can also be implemented to utilize historical data pertaining to domains and/or Web pages that have been in existence for a determinable duration, and have not historically been associated with phishing or fraudulent activities.

FIG. 5 illustrates an exemplary method 500 for phishing detection, prevention, and notification and is described with reference to the exemplary Web browsing system shown in FIG. 4. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At block 502, content is received from a network-based resource. For example, a Web browsing application 410 generates a Web browser user interface (e.g., Web browser user interface 110 shown in FIG. 1) such that a user at client device 404 can request and receive Web pages and other information from a network-based resource, such as a Web site or domain. At block 504, a user interface of a Web browsing application is rendered to display the content received from the network-based resource.

At block 506, the domain is compared to a list of known phishing domains. For example, detection module 420 compares the domain corresponding to the phishing Web site 408 to the list of known phishing Web sites 422 or cached list 426 of known phishing Web sites. The list of known phishing domains can be based on historical data corresponding to the known phishing domains. The domain can also be compared to a list of false positive domains and/or a whitelist to determine that the domain is not a phishing domain.

At block 508, a phishing attack is detected in the content at least in part by determining that a domain of the network-based resource is similar to a known phishing domain. For example, the detection module 420 determines that the domain corresponding to the phishing Web site 408 is similar to or included in the list of known phishing Web sites 422, which is detected as a phishing attack.

The phishing attack can also be detected by the detection module 420 when a name of the domain is similar in edit-distance to the known phishing victim domain, and/or when the edit-distance is based at least in part on the likelihood of user confusion, or based at least in part on a site-specific change. The phishing attack can be detected as a user-selectable link within the received content where the user-selectable link includes an IP (Internet protocol) address, an “@” sign, and/or suspicious HTML (Hypertext Markup Language) encoding. The phishing attack can also be detected if the content contains suspicious text content, contains a user-selectable link to a minimal amount of content, and/or is received via at least one of a dial-up, cable, or DSL (Digital Subscriber Line) communication link.

The phishing attack can also be detected by the detection module 420 if the content is received from a network-based resource which is a new domain, if the Web page has a low static rank, and/or if the content includes multiple user-selectable links to an additional network-based resource, and is configured to submit form data to a network-based resource other than the additional network-based resource. At block 510, the content is determined not to be a phishing attack if the content cannot return data to the domain, or to any other domain.

FIG. 6 illustrates various components of an exemplary computing device 600 in which embodiments of phishing detection, prevention, and notification can be implemented. For example, any one of client devices 104(1−N) (FIG. 1), client devices 204 (FIG. 2) and 404 (FIG. 4), and data centers 202 (FIG. 2) and 402 (FIG. 4) can be implemented as computing device 600 in the respective exemplary systems 200 and 400. Computing device 600 can also be implemented as any form of computing or electronic device with any number and combination of differing components as described below with reference to the exemplary computing environment 1000 shown in FIG. 10.

The computing device 600 includes one or more media content inputs 602 which may include Internet Protocol (IP) inputs over which streams of media content are received via an IP-based network. Computing device 600 further includes communication interface(s) 604 which can be implemented as any one or more of a serial and/or parallel interface, a wireless interface, any type of network interface, and as any other type of communication interface. A wireless interface enables computing device 600 to receive control input commands and other information from an input device, and a network interface provides a connection between computing device 600 and a communication network (e.g., communication network 106 shown in FIG. 1) by which other electronic and computing devices can communicate data with computing device 600.

Computing device 600 also includes one or more processors 606 (e.g., any of microprocessors, controllers, and the like) which process various computer executable instructions to control the operation of computing device 600, to communicate with other electronic and computing devices, and to implement embodiments of phishing detection, prevention, and notification. Computing device 600 can be implemented with computer readable media 608, examples of which include random access memory (RAM), non-volatile memory (e.g., any one or more of a read-only memory (ROM), flash memory, EPROM, EEPROM, etc.), and a disk storage device. A disk storage device can include any type of magnetic or optical storage device, such as a hard disk drive, a recordable and/or rewriteable compact disc (CD), a DVD, a DVD+RW, and the like.

Computer readable media 608 provides data storage mechanisms to store various information and/or data such as software applications and any other types of information and data related to operational aspects of computing device 600. For example, an operating system 610, various application programs 612, the Web browsing application(s) 410, the messaging application(s) 210, and the detection modules 220 and 420 can be maintained as software applications with the computer readable media 608 and executed on processor(s) 606 to implement embodiments of phishing detection, prevention, and notification. In addition, the computer readable media 608 can be utilized to maintain the history of visited Web sites 428, the message history 228, and the cached lists 226 and 426 for the various client devices which can be implemented as computing device 600.

As shown in FIG. 6, a Web browsing application 410 and a messaging application 210 are configured to communicate to further implement various embodiments of phishing detection, prevention, and notification. The messaging application 210 can notify the Web browsing application 410 when Web-based content (e.g., a Web page) is requested via a selectable link within an email message. In one embodiment, the messaging application 210 (via detection module 220) may have detected or determined fraudulent or suspected phishing content in a message, and can communicate a notification to the Web browsing application 410. The detection modules 220 and/or 420 can warn a user to prevent fraud based at least in part on whether a user arrived at a current Web page directly or indirectly via an email message or other messaging system.

In an embodiment, the various application programs 612 can include a machine learning component to implement features of phishing detection, prevention, and notification. A detection module 220 and/or 420 can implement the machine learning component to determine whether a Web page or message is suspicious or contains phishing content. Inputs to a machine learning module can include the full text of a Web page, the subject line and body of an email message, any inputs that can be provided to a spam detector, and/or the title bar of the Web page. Additionally, the machine learning component can be implemented with discriminative training.

Computing device 600 also includes audio and/or video input/outputs 614 that provide audio and/or video to an audio rendering and/or display device 616, or to other devices that process, display, and/or otherwise render audio, video, and display data. Video signals and audio signals can be communicated from computing device 600 to the display device 616 via an RF (radio frequency) link, S-video link, composite video link, component video link, analog audio connection, or other similar communication links. A warning message 618 can be generated for display on display device 616. The warning message 618 is merely exemplary, and any type of warning, be it text, graphic, audible, or any combination thereof, can be generated to warn a user of a possible phishing attack.

Although shown separately, some of the components of computing device 600 may be implemented in an application specific integrated circuit (ASIC). Additionally, a system bus (not shown) typically connects the various components within computing device 600. A system bus can be implemented as one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, or a local bus using any of a variety of bus architectures.

FIG. 7 illustrates an exemplary method 700 for phishing detection, prevention, and notification. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At block 702, a communication is received from a messaging application indicating that content has been requested via the messaging application. For example, a messaging application 210 (FIG. 2) can utilize a referring page, a URI (Uniform Resource Identifier), a Web browser switch, or a Web browser API (Application Program Interface) to communicate with the Web browsing application 410 that Web-based content has been requested via the messaging application 210.

At block 704, the content is received from a network-based resource. For example, Web browsing application 410 (FIG. 4) generates a Web browser user interface (e.g., Web browser user interface 110 shown in FIG. 1) such that a user at client device 404 can request and receive Web pages and other information from a network-based resource, such as a Web site or domain. At block 706, a user interface of a Web browsing application is rendered to display the content received from the network-based resource.

Typically, phishing attacks are conducted by a communication being received by a user instructing the user to visit a Web page. A user can arrive at Web pages in many ways, such as from a favorites list, by searching the Internet, and the like, most of which do not typically precede browsing to a Web page that conducts a phishing attack. For a Web-browsing phishing detector, knowing that a Web page being viewed was reached via a messaging application is a feature of phishing detection, prevention, and notification. The Web pages not reached via a messaging application can either be presumed to be safe, or the degree of suspicion of a Web page can be reduced if the Web page was not reached via a messaging application.

In addition, a messaging application may have its own degree of suspicion of the originating message. For instance, an originating message that fails a SenderID check would be highly suspicious. An originating message from a trusted sender that passed a SenderID check might be considered safe. The messaging application can communicate its degree of suspicion or related information to a Web-browsing phishing detector. If the Web-browsing phishing detector then detects further suspicious indications, these can be used in combination with the communications from the messaging application to determine an appropriate course of action, such as warning that the content may contain a phishing attack.
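As a minimal sketch of how a degree of suspicion from the messaging application might be used together with indications found by the Web-browsing phishing detector, consider the following. The `MessageContext` type, the four suspicion levels, and the warning threshold are all hypothetical and are chosen only for illustration; the description does not specify how the degree of suspicion is represented.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical suspicion levels; the description does not define a scale.
NONE, LOW, MEDIUM, HIGH = 0, 1, 2, 3
WARN_THRESHOLD = 4  # assumed cut-off for showing a warning


@dataclass
class MessageContext:
    """Hypothetical data passed from the messaging application."""
    sender_id_passed: bool
    trusted_sender: bool


def message_suspicion(ctx: Optional[MessageContext]) -> int:
    """Map the originating message to a degree of suspicion."""
    if ctx is None:
        return NONE      # page was not reached via a messaging application
    if not ctx.sender_id_passed:
        return HIGH      # a failed SenderID check is highly suspicious
    if ctx.trusted_sender:
        return LOW       # trusted sender that passed the SenderID check
    return MEDIUM


def choose_action(ctx: Optional[MessageContext], browser_indicators: int) -> str:
    """Combine the message's suspicion with the browser's own indications."""
    total = message_suspicion(ctx) + browser_indicators
    return "warn" if total >= WARN_THRESHOLD else "allow"
```

Under these assumed values, a message that fails its SenderID check needs only one further indication from the browser to trigger a warning, whereas a page that was not reached from a message needs several.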

At block 708, a phishing attack is prevented when the content is received from the network-based resource in response to a request for the content from the messaging application. For example, detection module 420 can determine that the request for the content originated from messaging application 210 via a referring page and a list of known Web-based email systems. A suspicion score that indicates a likelihood of a phishing attack may also be obtained from the messaging application. The phishing attack can also be prevented by combining the suspicion score with phishing information corresponding to the network-based resource to further determine the likelihood of the phishing attack.
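The referring-page check and the score combination at block 708 could be sketched as follows. This is an illustration under stated assumptions rather than the described implementation: the host names in `KNOWN_WEBMAIL_HOSTS` are placeholders, and the noisy-OR formula in `combine_scores` is one possible way to merge two likelihoods that the description leaves open.

```python
from urllib.parse import urlparse

# Placeholder list of known Web-based email systems (hypothetical hosts).
KNOWN_WEBMAIL_HOSTS = {"mail.example.com", "webmail.example.net"}


def reached_via_webmail(referrer: str) -> bool:
    """Return True if the referring page belongs to a known Web-based email system."""
    host = urlparse(referrer).hostname  # lowercased by urlparse, or None
    return host is not None and host in KNOWN_WEBMAIL_HOSTS


def combine_scores(message_score: float, resource_score: float) -> float:
    """Merge two likelihoods in [0, 1] with a noisy-OR (an assumed formula)."""
    return 1.0 - (1.0 - message_score) * (1.0 - resource_score)


# Example use: only combine the scores when the page was reached from webmail.
referrer = "https://mail.example.com/inbox?msg=3"
if reached_via_webmail(referrer):
    likelihood = combine_scores(0.5, 0.5)
else:
    likelihood = 0.0
```

The noisy-OR has the property that either score alone can raise the combined likelihood, and that two moderate scores together yield a higher likelihood than either one by itself.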

At block 710, a warning is communicated to a user via the user interface that the content may contain a phishing attack. Alternatively and/or in addition, at block 712, a warning is communicated to the user via the messaging application that the content may contain a phishing attack. For example, a warning can be rendered for viewing via a user interface display, or a warning can be communicated to a user as an email message.

FIG. 8 illustrates an exemplary method 800 for phishing detection, prevention, and notification. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At block 802, content is received from a network-based resource. For example, a Web browsing application 410 (FIG. 4) generates a Web browser user interface (e.g., Web browser user interface 110 shown in FIG. 1) such that a user at client device 404 can request and receive Web pages and other information from a network-based resource, such as a Web site or domain. At block 804, a user interface of a Web browsing application is rendered to display the content received from the network-based resource.

At block 806, a suspicious user-selectable link is detected in the content. For example, the detection module 420 (FIG. 4) can detect that a suspicious user-selectable link may be a link to an additional network-based resource, a URL (Uniform Resource Locator), and/or an email address. The user-selectable link can be detected as being similar to a known fraudulent target, as including suspicious text content, and/or including suspicious text content in a title bar of the user interface of the Web browsing application.
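One way to test whether a link is similar to a known fraudulent target is to measure the edit distance between the link's host name and each name on a list of frequently imitated sites. The sketch below assumes this approach; the `KNOWN_TARGETS` entries and the `max_distance` default are hypothetical, and the description does not prescribe a particular similarity measure.

```python
from typing import Optional

# Hypothetical list of legitimate sites that phishers commonly imitate.
KNOWN_TARGETS = ["examplebank.com", "example-pay.com"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similar_target(host: str, max_distance: int = 2) -> Optional[str]:
    """Return the target that `host` imitates, or None.

    An exact match (distance 0) is the target itself and is not flagged.
    """
    host = host.lower()
    for target in KNOWN_TARGETS:
        if 0 < edit_distance(host, target) <= max_distance:
            return target
    return None
```

With these placeholder values, `similar_target("examp1ebank.com")` returns `"examplebank.com"` because the two names differ by a single substituted character.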

At block 808, a warning is generated that explains why the user-selectable link is suspicious. For example, the detection module 420 can initiate generation of a warning that explains a difference between a valid user-selectable link and the suspicious user-selectable link. The warning can also be generated to explain that the user-selectable link includes an “@” sign, suspicious encoding, an IP (Internet Protocol) address, a redirector, and/or link text and a mismatched URL (Uniform Resource Locator).
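The explanation step could be sketched as a function that returns one human-readable reason per suspicious property of a link. The function name `explain_link`, the wording of the reasons, and the specific tests (for example, treating percent-encoding in the host portion as suspicious encoding) are assumptions made for illustration; redirector detection is omitted because it would require a list of redirector services that the description does not provide.

```python
import ipaddress
import re
from typing import List
from urllib.parse import urlparse


def explain_link(href: str, link_text: str) -> List[str]:
    """Return reasons why a link looks suspicious (empty list if none found)."""
    reasons = []
    parsed = urlparse(href)
    host = parsed.hostname or ""

    # An "@" in the authority part hides the real host after a decoy name.
    if "@" in parsed.netloc:
        reasons.append('The link address contains an "@" sign.')

    # Percent-encoding in the host portion can disguise the destination.
    if re.search(r"%[0-9A-Fa-f]{2}", parsed.netloc):
        reasons.append("The link address uses suspicious encoding.")

    # A numeric IP address in place of a domain name.
    try:
        ipaddress.ip_address(host)
        reasons.append("The link points to a numeric IP address.")
    except ValueError:
        pass

    # Visible text that looks like a URL but names a different host.
    text = link_text.strip()
    if "://" in text:
        text_host = urlparse(text).hostname or ""
        if text_host and text_host != host:
            reasons.append(
                f"The link text shows {text_host} but the link goes to {host}."
            )
    return reasons
```

For instance, a link whose address is `http://www.examplebank.com@192.0.2.7/login` and whose visible text is `http://www.examplebank.com/` yields three reasons under these rules: the "@" sign, the numeric IP address, and the mismatch between the link text and the URL.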

FIG. 9 illustrates an exemplary method 900 for phishing detection, prevention, and notification and is described with reference to an exemplary client device and/or data center (e.g., server device), such as shown in FIGS. 2-3. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At block 902, a messaging user interface is rendered to facilitate communication via a messaging application. For example, a messaging application 210 generates a messaging user interface (e.g., email application user interface 108 shown in FIG. 1) such that a user at client device 204 can communicate via email or other similar messaging applications. At block 904, a communication is received from a domain. For example, messaging application 210 receives an email message from a domain, such as the phishing Web site 208.

At block 906, a suspicious user-selectable link is detected in the communication. For example, the detection module 220 (FIG. 2) can detect that a suspicious user-selectable link may be any one of a network-based resource, a URL (Uniform Resource Locator), and/or an email address. The user-selectable link can be detected as being similar to a known fraudulent target, or can be included as part of a suspicious sender address or display name.
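As one illustration of a suspicious display name, a sender may place an email address in the display name that differs from the address the message was actually sent from. The check below is a sketch under that assumption; the function name and the simplified address pattern are hypothetical, and other kinds of suspicious sender information are not covered.

```python
import re

# Simplified pattern for an email address embedded in a display name.
_EMAIL_IN_NAME = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")


def display_name_mismatch(display_name: str, address: str) -> bool:
    """Return True if the display name shows an address in a different domain."""
    match = _EMAIL_IN_NAME.search(display_name)
    if match is None:
        return False  # display name does not contain an email address
    claimed_domain = match.group(1).lower()
    actual_domain = address.rsplit("@", 1)[-1].lower()
    return claimed_domain != actual_domain
```

A message with the display name `service@examplebank.com` sent from `x@phish.example` would be flagged, while an ordinary display name such as `Example Bank` would not.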

At block 908, a warning is generated that explains why the user-selectable link is suspicious. For example, the detection module 220 can initiate generation of a warning that explains a difference between a valid user-selectable link and the suspicious user-selectable link. Further, the warning can be generated to explain that the user-selectable link includes an “@” sign, suspicious encoding, an IP (Internet Protocol) address, a redirector, and/or link text and a mismatched URL (Uniform Resource Locator).

FIG. 10 illustrates an exemplary computing environment 1000 within which systems and methods for phishing detection, prevention, and notification, as well as the computing, network, and system architectures described herein, can be either fully or partially implemented. Exemplary computing environment 1000 is only one example of a computing system and is not intended to suggest any limitation as to the scope of use or functionality of the architectures. Neither should the computing environment 1000 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary computing environment 1000.

The computer and network architectures in computing environment 1000 can be implemented with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, client devices, hand-held or laptop devices, microprocessor-based systems, multiprocessor systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, gaming consoles, distributed computing environments that include any of the above systems or devices, and the like.

The computing environment 1000 includes a general-purpose computing system in the form of a computing device 1002. The components of computing device 1002 can include, but are not limited to, one or more processors 1004 (e.g., any of microprocessors, controllers, and the like), a system memory 1006, and a system bus 1008 that couples the various system components. The one or more processors 1004 process various computer executable instructions to control the operation of computing device 1002 and to communicate with other electronic and computing devices. The system bus 1008 represents any number of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.

Computing environment 1000 includes a variety of computer readable media which can be any media that is accessible by computing device 1002 and includes both volatile and non-volatile media, removable and non-removable media. The system memory 1006 includes computer readable media in the form of volatile memory, such as random access memory (RAM) 1010, and/or non-volatile memory, such as read only memory (ROM) 1012. A basic input/output system (BIOS) 1014 maintains the basic routines that facilitate information transfer between components within computing device 1002, such as during start-up, and is stored in ROM 1012. RAM 1010 typically contains data and/or program modules that are immediately accessible to and/or presently operated on by one or more of the processors 1004.

Computing device 1002 may include other removable/non-removable, volatile/non-volatile computer storage media. By way of example, a hard disk drive 1016 reads from and writes to a non-removable, non-volatile magnetic media (not shown), a magnetic disk drive 1018 reads from and writes to a removable, non-volatile magnetic disk 1020 (e.g., a “floppy disk”), and an optical disk drive 1022 reads from and/or writes to a removable, non-volatile optical disk 1024 such as a CD-ROM, digital versatile disk (DVD), or any other type of optical media. In this example, the hard disk drive 1016, magnetic disk drive 1018, and optical disk drive 1022 are each connected to the system bus 1008 by one or more data media interfaces 1026. The disk drives and associated computer readable media provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for computing device 1002.

Any number of program modules can be stored on RAM 1010, ROM 1012, hard disk 1016, magnetic disk 1020, and/or optical disk 1024, including by way of example, an operating system 1028, one or more application programs 1030, other program modules 1032, and program data 1034. Each of such operating system 1028, application program(s) 1030, other program modules 1032, program data 1034, or any combination thereof, may include one or more embodiments of the systems and methods described herein.

Computing device 1002 can include a variety of computer readable media identified as communication media. Communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” refers to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, other wireless media, and/or any combination thereof.

A user can interface with computing device 1002 via any number of different input devices such as a keyboard 1036 and pointing device 1038 (e.g., a “mouse”). Other input devices 1040 (not shown specifically) may include a microphone, joystick, game pad, controller, satellite dish, serial port, scanner, and/or the like. These and other input devices are connected to the processors 1004 via input/output interfaces 1042 that are coupled to the system bus 1008, but may be connected by other interface and bus structures, such as a parallel port, game port, and/or a universal serial bus (USB).

A display device 1044 (or other type of monitor) can be connected to the system bus 1008 via an interface, such as a video adapter 1046. In addition to the display device 1044, other output peripheral devices can include components such as speakers (not shown) and a printer 1048 which can be connected to computing device 1002 via the input/output interfaces 1042.

Computing device 1002 can operate in a networked environment using logical connections to one or more remote computers, such as remote computing device 1050. By way of example, remote computing device 1050 can be a personal computer, portable computer, a server, a router, a network computer, a peer device or other common network node, and the like. The remote computing device 1050 is illustrated as a portable computer that can include any number and combination of the different components, elements, and features described herein relative to computing device 1002.

Logical connections between computing device 1002 and the remote computing device 1050 are depicted as a local area network (LAN) 1052 and a general wide area network (WAN) 1054. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. When implemented in a LAN networking environment, the computing device 1002 is connected to a local network 1052 via a network interface or adapter 1056. When implemented in a WAN networking environment, the computing device 1002 typically includes a modem 1058 or other means for establishing communications over the wide area network 1054. The modem 1058 can be internal or external to computing device 1002, and can be connected to the system bus 1008 via the input/output interfaces 1042 or other appropriate mechanisms. The illustrated network connections are merely exemplary and other means of establishing communication link(s) between the computing devices 1002 and 1050 can be utilized.

In a networked environment, such as that illustrated with computing environment 1000, program modules depicted relative to the computing device 1002, or portions thereof, may be stored in a remote memory storage device. By way of example, remote application programs 1060 are maintained with a memory device of remote computing device 1050. For purposes of illustration, application programs and other executable program components, such as operating system 1028, are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computing device 1002, and are executed by the one or more processors 1004 of the computing device 1002.

Although embodiments of phishing detection, prevention, and notification have been described in language specific to structural features and/or methods, it is to be understood that the subject of the appended claims is not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as exemplary implementations of phishing detection, prevention, and notification.

1. A system, comprising a phishing detection module configured to compare a history of user activity to a list of known phishing domains, and further configured to initiate a warning if a domain similar to a known phishing domain is included in the history of user activity.
2. A system as recited in claim 1, wherein the phishing detection module is a component of a Web browsing application, and is further configured to initiate the warning to a user to identify a visited Web site as a phishing Web site.
3. A system as recited in claim 1, wherein the phishing detection module is a component of a Web browsing application, and is further configured to maintain a history of data submitted via a Web browsing user interface to one or more visited Web sites.
4. A system as recited in claim 1, wherein the phishing detection module is a component of a Web browsing application, is further configured to maintain a history of data submitted via a Web browsing user interface to one or more visited Web sites, and is further configured to initiate the warning to a user in an event that data is submitted to a phishing Web site.
5. A system as recited in claim 1, wherein the phishing detection module is a component of a messaging application, and is further configured to initiate the warning to a user in an event that a suspected phishing communication is rendered for viewing.
6. A system as recited in claim 1, wherein the phishing detection module is a component of a messaging application, and is further configured to initiate the warning to a user in an event that the user replies to a phishing communication.
7. A system as recited in claim 1, wherein the phishing detection module is a component of a messaging application, and is further configured to initiate the warning to a user in an event that the user communicates a message to a phishing address.
8. A system as recited in claim 1, wherein the phishing detection module is a component of a messaging application, and is further configured to receive a warning message in an event that private information is potentially disclosed.
9. A system as recited in claim 1, further comprising an email server configured to communicate a warning message to the messaging application in an event that private information is potentially disclosed.
10. A method, comprising: receiving content from a network-based resource; rendering a user interface of a Web browsing application to display the content received from the network-based resource; detecting a suspicious user-selectable link in the content that is a link to at least one of an additional network-based resource, a URL (Uniform Resource Locator), or an email address; and generating a warning that explains why the user-selectable link is suspicious.
11. A method as recited in claim 10, wherein generating the warning includes generating the warning to explain that the user-selectable link includes at least one of an “@” sign, suspicious encoding, an IP (Internet Protocol) address, a redirector, a link text and a mismatched URL (Uniform Resource Locator), or that the user-selectable link is a known phishing site.
12. A method as recited in claim 10, wherein detecting the suspicious user-selectable link includes detecting that the user-selectable link is similar to a known fraudulent target.
13. A method as recited in claim 10, wherein generating the warning includes generating the warning to explain a difference between a valid user-selectable link and the suspicious user-selectable link.
14. A method as recited in claim 10, wherein detecting the suspicious user-selectable link includes detecting that the user-selectable link includes suspicious text content.
15. A method as recited in claim 10, wherein detecting the suspicious user-selectable link includes detecting that the user-selectable link includes suspicious text content in a title bar of the user interface of the Web browsing application.
16. A method, comprising: rendering a messaging user interface to facilitate communication via a messaging application; receiving a communication from a domain; detecting a suspicious user-selectable link in the communication that is a link to at least one of a network-based resource, a URL (Uniform Resource Locator), or an email address; and generating a warning that explains why the user-selectable link is suspicious.
17. A method as recited in claim 16, wherein generating the warning includes generating the warning to explain that the user-selectable link includes at least one of an “@” sign, suspicious encoding, an IP (Internet Protocol) address, a redirector, or link text and a mismatched URL (Uniform Resource Locator).
18. A method as recited in claim 16, wherein detecting the suspicious user-selectable link includes detecting that the user-selectable link is similar to a known fraudulent target.
19. A method as recited in claim 16, wherein generating the warning includes generating the warning to explain a difference between a valid user-selectable link and the suspicious user-selectable link.
20. A method as recited in claim 16, wherein detecting the suspicious user-selectable link includes detecting that the user-selectable link includes at least one of a suspicious sender address, or display name.